Business Analytics

Introduction and Resources

Ayush Patel and Jayati Sharma

09 February, 2024

Before we begin

Please refer to the course landing page for accessing all lecture slides and other resources related to the course.

Make sure that you have the following packages installed in R.

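# Install pacman only if it is not already available
if (!requireNamespace("pacman", quietly = TRUE)) {
  install.packages("pacman")
}

# p_load() installs any missing packages and then loads them all
pacman::p_load(tidyverse, rmarkdown,
               tinytex, ISLR, ISLR2,
               openintro, opendatatoronto,
               causaldata)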

About me

I am Ayush.

I am a researcher working at the intersection of data, law, development and economics.

I teach Data Science using R at Gokhale Institute of Politics and Economics.

I am an RStudio (Posit) certified tidyverse Instructor.

I am a Researcher at the Oxford Poverty and Human Development Initiative (OPHI), at the University of Oxford.

Reach me

ayush.ap58@gmail.com

ayush.patel@gipe.ac.in

About the course

What this course will cover

  • How to do exploratory and descriptive data analysis
  • Effective data visualization
  • Literate Programming
  • Learning to work with important datasets and prepare reports

What this course will NOT cover

  • All statistical techniques that exist
  • All the mathematics behind a statistical technique
  • Computer science

RStudio projects - Why?

Content for this topic has been sourced from Danielle Navarro’s workshop. Please check out her work for detailed information.

The core idea behind RStudio Projects, in their own words, is that

RStudio projects make it straightforward to divide your work into multiple contexts, each with their own working directory, workspace, history, and source documents.

  • Why should I use RStudio projects?
    • Convenience
      • Keeps things tidy
      • Smooths the process
    • Functionality
      • RStudio projects have a .Rproj file, which serves as an anchor for other packages
      • Also useful for sharing code with other people
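
One way to see the "anchor" idea in practice is the here package, which looks for markers such as a .Rproj file to locate the project root. This is a minimal sketch: here is not in the package list above, so it is an assumption that you have installed it, and data/sales.csv is a hypothetical file used only for illustration.

```r
library(here)

# here() reports the project root it detected (for example, the folder
# that holds the .Rproj file).
project_root <- here()

# Hypothetical file location inside the project: data/sales.csv
sales_path <- here("data", "sales.csv")

# The path is built from the project root, so the same code works on a
# collaborator's machine without editing absolute paths.
print(sales_path)
```

Because the path never mentions a user-specific folder, the same script can be shared along with the project folder.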

RStudio projects - How?

Content for this topic has been sourced from Danielle Navarro’s workshop. Please check out her work for detailed information.

Go to the little blue menu in the top right corner of RStudio, click on the dropdown menu, and select “New Project”.

This will bring up a dialog box that provides a few different options. If you are going to work in a new folder, choose ‘New Directory’. If you want to work in an existing folder, select ‘Existing Directory’ and browse for the location.

RStudio projects - How?

Once you’ve created the project, if you have a look at the folder in Windows Explorer / Mac Finder, you’ll see the project’s .Rproj file there!

Data Wrangling - What and Why

What is data wrangling?

  • You rarely get data in exactly the format you need. You will usually want to add, delete, filter or transform variables.
  • Also, when you start exploring your data, you will want to create summaries of variables.
  • This transformation or manipulation of data is called data wrangling.

Why should I learn it?

  • Because you will almost never get data that is already in the format that you want to work in.
  • Through this course, you will learn data wrangling techniques in R using dplyr and other tidyverse packages.
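
To make the idea of filtering, transforming and summarising concrete, here is a small sketch using dplyr on mtcars, a dataset that ships with R. The choice of dataset, the 100 hp cut-off and the new column name kpl are illustrative assumptions, not part of the course material.

```r
library(dplyr)

cars_summary <- mtcars |>
  # filter: keep only cars with more than 100 horsepower
  filter(hp > 100) |>
  # transform: convert miles per (US) gallon to kilometres per litre
  mutate(kpl = mpg * 0.425144) |>
  # delete: keep only the variables we need
  select(cyl, hp, kpl) |>
  # summarise: one row per number of cylinders
  group_by(cyl) |>
  summarise(n_cars = n(), mean_kpl = mean(kpl), .groups = "drop")

print(cars_summary)
```

Each step takes a data frame and returns a data frame, which is why the steps can be chained with the pipe.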

Data Visualization

What is data visualization?

  • Graphical representation of data
  • Converting large amounts of data into easily comprehensible visuals

Why should I learn it?

  • To present data in an appealing manner
  • To understand patterns from data
  • Through this course, you will learn data visualization techniques in R using the ggplot2 package.
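
As a first taste of ggplot2, the sketch below draws a scatter plot from the built-in mtcars data. The dataset and the variables chosen are assumptions made for illustration only.

```r
library(ggplot2)

# Map car weight to the x axis, fuel economy to the y axis,
# and number of cylinders to colour.
p <- ggplot(mtcars, aes(x = wt, y = mpg, colour = factor(cyl))) +
  geom_point(size = 2) +
  labs(
    title = "Heavier cars tend to be less fuel efficient",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon",
    colour = "Cylinders"
  )

# Printing the plot object draws it.
print(p)
```

A plot is built in layers: the data and aesthetic mappings come first, then a geom such as geom_point(), then labels.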

Literate Programming

Content for this topic has been sourced from Stephanie Hicks’s lecture. Please check out her work for detailed information.

What is Literate Programming?

  • One basic idea for making reproducible reports easier to write is known as literate statistical programming
  • Think of a report or a publication as a stream of text and code
    • The text is readable by people and the code is readable by computers
    • The analysis is described in code chunks
    • Each text chunk will relay something in a human readable language

Why should I learn it?

  • Because the aim is not just to learn to write code, but to write it in a way that is readable and reproducible
  • Through this course, you will learn literate programming skills in R with R Markdown
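
The stream of text and code described above can be sketched as a tiny R Markdown file. The example below writes such a file to a temporary folder and renders it if pandoc is available. The file name tiny-report.Rmd and its contents are hypothetical, and in practice you would normally write the .Rmd file directly in RStudio rather than generate it from code.

```r
library(rmarkdown)

# Three backticks open and close a code chunk in R Markdown.
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  "title: \"A tiny literate report\"",
  "output: html_document",
  "---",
  "",
  # Text chunk: readable by people, with a small inline R expression
  "The mtcars data has `r nrow(mtcars)` rows. The chunk below summarises fuel economy.",
  "",
  # Code chunk: readable by the computer
  paste0(fence, "{r mpg-summary}"),
  "summary(mtcars$mpg)",
  fence
)

# Write the report source to a temporary folder
rmd_path <- file.path(tempdir(), "tiny-report.Rmd")
writeLines(rmd_lines, rmd_path)

# Rendering needs pandoc, which ships with RStudio
if (pandoc_available()) {
  html_path <- render(rmd_path, quiet = TRUE)
  print(html_path)
}
```

Rendering re-runs the code every time, so the numbers in the report always match the data and code that produced them.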

Exercise

  • By next week, set up your own GitHub account and website
  • You can use the help of this resource
  • ALSO, fully completing this gives you 3 marks as extra credit :)

Thank you :)